A Generative Approach for Multi-Document Summarization using the Noisy Channel Model
Abstract
Multi-document summarization is the automatic production of a single summary from a collection of texts. The task has become important because it supports information processing at a time when the volume of available information is growing rapidly. In this paper, we propose a statistical generative approach to multi-document summarization. In particular, we formulate the task using a noisy-channel model. This approach is novel for multi-document summarization, and it explores the summarization process by analyzing factors such as redundancy, complementarity, and contradiction. We model these factors using Cross-document Structure Theory (CST).
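As a rough illustration of what a noisy-channel formulation generally looks like, the following sketch applies Bayes' rule to the choice of a summary. The symbols S (a candidate summary) and D (the document collection) are assumptions introduced here for illustration; they are not notation taken from the paper.

```latex
\hat{S} \;=\; \arg\max_{S} P(S \mid D)
        \;=\; \arg\max_{S} \frac{P(S)\, P(D \mid S)}{P(D)}
        \;=\; \arg\max_{S} P(S)\, P(D \mid S)
```

In this generic reading, P(S) acts as the source model and P(D | S) as the channel model, and P(D) can be dropped because it does not depend on S. How the paper instantiates these terms with CST relations is not stated in the abstract.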
Similar resources
A Generative Approach for Multi-Document Summarization using Semantic-Discursive information
Multi-document summarization is the automatic production of a unique summary from a collection of texts. In this paper, we propose a statistical generative approach for multi-document summarization that combines simple information such as sentence position in the text and semantic-discursive information from CST (Cross-Document Structure Theory). In particular, we formulate the multi-document s...
Multi-candidate reduction: Sentence compression as a tool for document summarization tasks
This article examines the application of two single-document sentence compression techniques to the problem of multi-document summarization—a “parse-and-trim” approach and a statistical noisy-channel approach. We introduce the Multi-Candidate Reduction (MCR) framework for multi-document summarization, in which many compressed candidates are generated for each source sentence. These candidates a...
A Hybrid Hierarchical Model for Multi-Document Summarization
Scoring sentences in documents given abstract summaries created by humans is important in extractive multi-document summarization. In this paper, we formulate extractive summarization as a two step learning problem building a generative model for pattern discovery and a regression model for inference. We calculate scores for sentences in document clusters based on their latent characteristics u...
Extractive summarization using a latent variable model
Extractive multi-document summarization is the task of choosing sentences from a set of documents to compose a summary text in response to a user query. We propose a generative approach to explicitly identify summary and non-summary topic distributions in the sentences of a given set of documents (i.e., document cluster). Using these approximate summary topic probabilities as latent output vari...
EXTRACTION-BASED TEXT SUMMARIZATION USING FUZZY ANALYSIS
Due to the explosive growth of the world-wide web, automatic text summarization has become an essential tool for web users. In this paper we present a novel approach for creating text summaries. Using fuzzy logic and WordNet, our model extracts the most relevant sentences from an original document. The approach utilizes fuzzy measures and inference on the extracted textual information from the docu...
Publication date: 2011